MBTI Personality Prediction Using Machine Learning and SMOTE for Balancing Data Based on Statement Sentences

Authors

Abstract

The rise of social media as a platform for self-expression and self-understanding has led to increased interest in using the Myers–Briggs Type Indicator (MBTI) to explore human personalities. Despite this, there needs to be more research on how other word-embedding techniques, machine learning algorithms, and imbalanced data-handling techniques can improve the results of MBTI personality-type predictions. Our research aimed to investigate the efficacy of these techniques by utilizing the Word2Vec model to obtain a vector representation of the words in the corpus data. We implemented several machine learning approaches, including logistic regression, linear support vector classification, stochastic gradient descent, random forest, the extreme gradient boosting classifier, and the cat boosting classifier. In addition, we used the synthetic minority oversampling technique (SMOTE) to address the issue of imbalanced data. The results showed that our approach could achieve a relatively high F1 score (between 0.7383 and 0.8282), depending on the model chosen for predicting and classifying MBTI personality. Furthermore, we found that using SMOTE could improve the selected models’ performance (F1 score between 0.7553 and 0.8337), proving that the machine learning approach integrated with Word2Vec and SMOTE could predict and classify MBTI personality well, thus enhancing the understanding of MBTI.
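The abstract names SMOTE as the balancing step but does not describe its mechanics. The following is a minimal sketch of the core idea, assuming toy two-dimensional vectors stand in for the Word2Vec-derived features. The function name `smote_oversample`, its parameters, and the sample data are illustrative assumptions, not the authors' implementation. Each synthetic point is placed on the line segment between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Hypothetical helper for illustration; uses only the standard library.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the remaining minority samples by Euclidean distance to it.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        # Choose one of the k nearest neighbours and interpolate toward it.
        neighbour = rng.choice(others[:k])
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy feature vectors (assumed data): six majority and three minority samples.
majority = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.0, 0.4], [0.4, 0.1]]
minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.3]]

new_points = smote_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In practice a library implementation, such as the `SMOTE` class in imbalanced-learn, would typically be used instead of a hand-written routine, and oversampling is usually applied to the training split only so that synthetic points do not leak into evaluation.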


Similar resources

Machine Learning Models for Housing Prices Forecasting using Registration Data

This article has been compiled to identify the best model of housing price forecasting using machine learning methods with maximum accuracy and minimum error. Five important machine learning algorithms are used to predict housing prices, including Nearest Neighbor Regression Algorithm (KNNR), Support Vector Regression Algorithm (SVR), Random Forest Regression Algorithm (RFR), Extreme Gradient B...


Stock Price Prediction using Machine Learning and Swarm Intelligence

Background and Objectives: Stock price prediction has become one of the interesting and also challenging topics for researchers in the past few years. Due to the non-linear nature of the time-series data of the stock prices, mathematical modeling approaches usually fail to yield acceptable results. Therefore, machine learning methods can be a promising solution to this problem. Methods: In this...


Identification Psychological Disorders Based on Data in Virtual Environments Using Machine Learning

Introduction: Psychological disorders are among the most problematic and important issues in today's society. Early prognosis of these disorders matters because receiving professional help at the appropriate time could improve the quality of life of these patients. Recently, researchers have used social media as a new tool for identifying psychological disorders. It seems that through the use of...


Thermal conductivity of Water-based nanofluids: Prediction and comparison of models using machine learning

Statistical methods, and especially machine learning, have been increasingly used in nanofluid modeling. This paper presents some of the interesting and applicable methods for thermal conductivity prediction and compares them with each other according to results and errors that are defined. The thermal conductivity of nanofluids increases with the volume fraction and temperature. Machine learni...


Machine Learning Based Drug Indication Prediction Using Linked Open Data

In this study, drug and disease features were obtained by querying open linked data to train our classifier for predicting new drug indications, and the predictive performance of the classifier for different validation schemes was evaluated. We collected the drug and disease data from Bio2RDF, an open source project that uses semantic web technologies to link data from multiple sources. A binar...



Journal

Journal title: Information

Year: 2023

ISSN: 2078-2489

DOI: https://doi.org/10.3390/info14040217